


d914a6c6d93c8df063b9099a076a488c-AuthorFeedback.pdf

Neural Information Processing Systems

For instance, one can define a new upper bounding function that returns the minimum of two other upper bounding functions. The difficulty is in finding other upper bounding functions that are (1) tighter than the bound we use, (2) efficient to compute, and (3) empirically require few calls to Refine. Unfortunately, the fast C implementation of the bound provided by [1] is numerically unstable for sparse matrices with 0 entries, and the numerically stable MATLAB implementation is prohibitively slow. This difficulty could be overcome by rewriting an efficient implementation. This bound is computed by solving an optimization problem, but unfortunately we do not know of an efficient solution.
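The combination described in the feedback, taking the pointwise minimum of two valid upper bounds, can be sketched in a few lines (the example bound functions below are hypothetical placeholders, not the paper's bounds):

```python
def combine_upper_bounds(bound_a, bound_b):
    """Return a new upper bound: the pointwise minimum of two valid bounds.

    If bound_a(x) >= f(x) and bound_b(x) >= f(x) for all x, then
    min(bound_a(x), bound_b(x)) >= f(x), so the combination is still a
    valid upper bound and is at least as tight as either input.
    """
    return lambda x: min(bound_a(x), bound_b(x))

# Hypothetical example: two valid upper bounds on f(x) = x**2 over [0, 1]
constant_bound = lambda x: 1.0   # 1 >= x**2 on [0, 1]
linear_bound = lambda x: x       # x >= x**2 on [0, 1]
tight = combine_upper_bounds(constant_bound, linear_bound)
```

The combined bound inherits properties (1) and (2) automatically; the open question raised above, property (3), is whether it actually reduces calls to Refine in practice.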


Heterogeneous Point Set Transformers for Segmentation of Multiple View Particle Detectors

Robles, Edgar E., Sagar, Dikshant, Yankelevich, Alejandro, Bian, Jianming, Baldi, Pierre, Collaboration, NOvA

arXiv.org Artificial Intelligence

NOvA is a long-baseline neutrino oscillation experiment that detects neutrinos from the NuMI beam at Fermilab. Before data from this experiment can be used in analyses, raw hits in the detector must be matched to their source particles, and the type of each particle must be identified. This task has commonly been done using a mix of traditional clustering approaches and convolutional neural networks (CNNs). Due to the construction of the detector, the data is presented as two sparse 2D images, an XZ and a YZ view of the detector, rather than as a 3D representation. We propose a point set neural network that operates on the sparse matrices with an operation that mixes information from both views. Our model uses less than 10% of the memory required by previous methods while achieving a 96.8% AUC score, higher than the score obtained when both views are processed independently (85.4%).
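One simple way to picture an operation that mixes information across the two views is to augment each point's features with a pooled summary of the other view. The mean-pool-and-concatenate scheme below is an illustrative assumption, not the paper's actual operator (a real model would use a learned attention or message-passing layer):

```python
def mean_pool(features):
    """Average a list of per-point feature vectors into one summary vector."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dim)]

def mix_views(xz_feats, yz_feats):
    """Augment each point in one view with a pooled summary of the other view.

    Illustrative sketch: plain mean pooling + concatenation stands in for
    whatever learned cross-view operator the actual architecture uses.
    """
    xz_summary = mean_pool(xz_feats)
    yz_summary = mean_pool(yz_feats)
    mixed_xz = [f + yz_summary for f in xz_feats]  # list concat = feature concat
    mixed_yz = [f + xz_summary for f in yz_feats]
    return mixed_xz, mixed_yz

# Two hypothetical points in the XZ view, one in the YZ view (2-d features)
xz = [[1.0, 0.0], [3.0, 2.0]]
yz = [[0.0, 4.0]]
mx, my = mix_views(xz, yz)
```

After mixing, every point carries information from both projections, which is what lets the joint model outperform two independently processed views.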


Secure Sparse Matrix Multiplications and their Applications to Privacy-Preserving Machine Learning

Damie, Marc, Hahn, Florian, Peter, Andreas, Ramon, Jan

arXiv.org Artificial Intelligence

To preserve privacy, multi-party computation (MPC) enables executing Machine Learning (ML) algorithms on secret-shared or encrypted data. However, existing MPC frameworks are not optimized for sparse data, which makes them unsuitable for ML applications involving sparse data, e.g., recommender systems or genomics. Even in plaintext, such applications involve high-dimensional sparse data that cannot be processed without sparsity-related optimizations, due to prohibitively large memory requirements. Since matrix multiplication is central to ML algorithms, we propose MPC algorithms to multiply secret sparse matrices. On the one hand, our algorithms avoid the memory issues of the "dense" data representation of classic secure matrix multiplication algorithms. On the other hand, our algorithms can significantly reduce communication costs (some experiments show a factor of 1,000) for realistic problem sizes. We validate our algorithms in two ML applications in which existing protocols are impractical. An important question when developing MPC algorithms is what assumptions can be made. In our case, if the number of non-zeros in a row is a sensitive piece of information, then a short runtime may reveal that the number of non-zeros is small. Existing approaches make relatively simple assumptions, e.g., that there is a universal upper bound on the number of non-zeros in a row. This often does not align with statistical reality: in many sparse datasets, the amount of data per instance follows a power law. We propose an approach that allows adopting a safe upper bound on the distribution of non-zeros in the rows and columns of sparse matrices.
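The padding idea behind such an upper bound can be illustrated in plaintext: store each row as (column, value) pairs padded to a fixed cap, so the stored length no longer reveals the true number of non-zeros. The representation and the `pad_row`/`sparse_matvec` helpers below are illustrative sketches, not the paper's MPC protocol (which operates on secret-shared values):

```python
def pad_row(row_nonzeros, cap):
    """Pad a row's (col, value) list to a fixed length `cap` with dummy
    zero entries, so the padded length no longer leaks how many non-zeros
    the row actually has."""
    if len(row_nonzeros) > cap:
        raise ValueError("cap must upper-bound the non-zeros per row")
    return row_nonzeros + [(0, 0.0)] * (cap - len(row_nonzeros))

def sparse_matvec(padded_rows, x):
    """Multiply a padded sparse matrix by a dense vector x.

    Dummy (0, 0.0) entries contribute 0.0 * x[0] = 0, so padding changes
    the cost profile but not the result.
    """
    return [sum(v * x[c] for c, v in row) for row in padded_rows]

# 2x3 matrix [[5, 0, 0], [0, 1, 2]] with a cap of 2 non-zeros per row
rows = [pad_row([(0, 5.0)], cap=2),
        pad_row([(1, 1.0), (2, 2.0)], cap=2)]
y = sparse_matvec(rows, [1.0, 1.0, 1.0])  # -> [5.0, 3.0]
```

Choosing `cap` from the distribution of non-zeros (rather than a universal worst case) is exactly the trade-off the abstract describes: a smaller cap saves work, but the cap must be safe for power-law-distributed row densities.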


Neural Sum-of-Squares: Certifying the Nonnegativity of Polynomials with Transformers

Pelleriti, Nico, Spiegel, Christoph, Liu, Shiwei, Martínez-Rubio, David, Zimmer, Max, Pokutta, Sebastian

arXiv.org Artificial Intelligence

Certifying nonnegativity of polynomials is a well-known NP-hard problem with direct applications spanning non-convex optimization, control, robotics, and beyond. A sufficient condition for nonnegativity is the Sum of Squares (SOS) property, i.e., that the polynomial can be written as a sum of squares of other polynomials. In practice, however, certifying the SOS criterion remains computationally expensive and often involves solving a Semidefinite Program (SDP), whose dimensionality grows quadratically in the size of the monomial basis of the SOS expression; hence, various methods to reduce the size of the monomial basis have been proposed. In this work, we introduce the first learning-augmented algorithm to certify the SOS criterion. To this end, we train a Transformer model that predicts an almost-minimal monomial basis for a given polynomial, thereby drastically reducing the size of the corresponding SDP. Our overall methodology comprises three key components: efficient generation of a training dataset of over 100 million SOS polynomials, design and training of the corresponding Transformer architecture, and a systematic fallback mechanism to ensure correct termination, which we analyze theoretically. We validate our approach on over 200 benchmark datasets, achieving speedups of over 100x compared to state-of-the-art solvers and enabling the solution of instances where competing approaches fail. Our findings provide novel insights towards transforming the practical scalability of SOS programming. Since this expression is positive when γ < 1 and zero when γ = 1, we conclude that γ = 1 is the global minimum of the polynomial. While this example admits a straightforward algebraic verification, deciding nonnegativity is NP-hard in general, even for simple quartic polynomials such as the one above, which motivates the use of convex relaxations (Ahmadi et al., 2011).
The method guarantees correctness: if an SOS certificate exists, it will be found; otherwise, infeasibility is certified at the full basis.
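The role of the monomial basis can be seen in the Gram-matrix formulation p(x) = z(x)^T Q z(x), where z is the vector of basis monomials and a positive semidefinite Q certifies SOS; pruning unneeded monomials shrinks Q and hence the SDP. The polynomial and pruned basis below are an illustrative example, not taken from the paper:

```python
def gram_eval(Q, basis_exps, x):
    """Evaluate p(x) = z(x)^T Q z(x), where z(x) = [x**e for e in basis_exps]."""
    z = [x ** e for e in basis_exps]
    n = len(z)
    return sum(Q[i][j] * z[i] * z[j] for i in range(n) for j in range(n))

# Illustrative example: p(x) = x^4 + 2x^2 + 1 = (x^2 + 1)^2.
# An SOS certificate needs only the basis {1, x^2}; the monomial x can be
# pruned, shrinking the Gram matrix (and the SDP) from 3x3 to 2x2.
Q = [[1.0, 1.0],
     [1.0, 1.0]]   # positive semidefinite (rank 1), so p is certified SOS
basis = [0, 2]     # exponents of the pruned basis {1, x^2}

for x in (-2.0, 0.0, 1.5):
    assert abs(gram_eval(Q, basis, x) - (x**4 + 2 * x**2 + 1)) < 1e-9
```

A learned basis predictor proposes a small `basis` like this one; the fallback mechanism guarantees that if the pruned basis turns out to be too small, the method retries with a larger (ultimately the full) basis.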



Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: quality, clarity, originality, and significance. This paper introduces a fast CCA algorithm for dealing with large-scale data. By showing that CCA can be solved via iterative least squares (LS), a fast LS algorithm is naturally employed to speed up the CCA computation. An error analysis for the proposed fast CCA algorithm is also provided. The experiments on large-scale datasets evaluate and compare different fast CCA methods, and are comprehensive and convincing. The paper is well written and easy to follow.
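The iterative-LS view of CCA can be sketched as a power-method-style alternation between two least-squares regressions. The plain-Python implementation below, with exact normal-equation solves, is an illustrative sketch; the speedup in a fast CCA algorithm comes from replacing each exact solve with a fast approximate LS solver:

```python
def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination (tiny dense systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [M[r][k] - f * M[c][k] for k in range(n + 1)]
    return [M[i][n] / M[i][i] for i in range(n)]

def lstsq(X, y):
    """Least-squares fit via the normal equations: w = (X^T X)^{-1} X^T y."""
    d = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(d)] for i in range(d)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(d)]
    return solve(XtX, Xty)

def cca_als(X, Y, iters=50):
    """First canonical pair via alternating least squares (illustrative sketch)."""
    wy = [1.0] * len(Y[0])
    for _ in range(iters):
        t = [sum(w * v for w, v in zip(wy, row)) for row in Y]
        wx = lstsq(X, t)                       # regress Y's projection onto X
        s = [sum(w * v for w, v in zip(wx, row)) for row in X]
        wy = lstsq(Y, s)                       # regress X's projection onto Y
        norm = sum(v * v for v in wy) ** 0.5
        wy = [v / norm for v in wy]            # renormalize to avoid collapse
    return wx, wy

# Toy data: the first coordinates of X and Y carry an identical shared signal,
# so the first canonical directions should align with them.
X = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0], [5.0, 1.0], [6.0, -1.0]]
Y = [[1.0, 0.5], [2.0, 0.3], [3.0, -0.2], [4.0, 0.1], [5.0, -0.4], [6.0, 0.7]]
wx, wy = cca_als(X, Y)
```

Each round applies a composition of two regression operators whose top eigenvector is the first canonical direction, which is why the alternation converges like a power method.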


Adaptive Ensemble Learning with Gaussian Copula for Load Forecasting

Yang, Junying, Lu, Gang, Yan, Xiaoqing, Xia, Peng, Wu, Di

arXiv.org Artificial Intelligence

Machine learning (ML) is capable of accurate load forecasting from complete data. However, many uncertainties affect data collection, leading to sparsity. This article proposes a model called Adaptive Ensemble Learning with Gaussian Copula to deal with sparsity, which contains three modules: data complementation, ML construction, and adaptive ensemble. First, it applies a Gaussian copula to eliminate sparsity. Then, we utilise five ML models to make predictions individually. Finally, it employs an adaptive ensemble to obtain the final weighted-sum result. Experiments demonstrate that our model is robust.
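The adaptive weighted-sum step can be illustrated with a simple scheme in which each model's weight is inversely proportional to its recent forecast error. This particular rule is a hypothetical stand-in, since the abstract does not specify the adaptation formula:

```python
def adaptive_weights(errors, eps=1e-9):
    """Weight each model inversely to its recent forecast error.

    Hypothetical adaptation rule: the abstract does not specify the exact
    weighting formula, so this is an illustrative stand-in.
    """
    inv = [1.0 / (e + eps) for e in errors]
    total = sum(inv)
    return [w / total for w in inv]

def ensemble_forecast(predictions, errors):
    """Weighted-sum combination of the individual model predictions."""
    weights = adaptive_weights(errors)
    return sum(w * p for w, p in zip(weights, predictions))

# Three hypothetical models forecasting the next load value
preds = [100.0, 110.0, 90.0]   # individual forecasts
errs = [1.0, 4.0, 4.0]         # recent mean absolute errors per model
forecast = ensemble_forecast(preds, errs)   # close to 100.0: the most
                                            # accurate model dominates
```

Recomputing the weights from a rolling error window is what makes such an ensemble "adaptive": models that degrade on recent data automatically lose influence.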



The Ubiquitous Sparse Matrix-Matrix Products

Buluç, Aydın

arXiv.org Artificial Intelligence

Multiplication of a sparse matrix with another (dense or sparse) matrix is a fundamental operation that captures the computational patterns of many data science applications, including but not limited to graph algorithms, sparsely connected neural networks, graph neural networks, clustering, and many-to-many comparisons of biological sequencing data. In many application scenarios, the matrix multiplication takes place on an arbitrary algebraic semiring, where the scalar operations are overloaded with user-defined functions with certain properties, or on a more general heterogeneous algebra where even the domains of the input matrices can differ. Here, we provide a unifying treatment of the sparse matrix-matrix operation and its rich application space, including machine learning, computational biology and chemistry, graph algorithms, and scientific computing.
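The semiring-generalized sparse matrix-matrix product (SpGEMM) can be sketched with a dict-of-dicts representation, where the `add` and `mul` arguments play the roles of the overloaded scalar operations (the representation and API here are illustrative, not a particular library's interface):

```python
def spgemm(A, B, add=lambda x, y: x + y, mul=lambda x, y: x * y):
    """Sparse matrix-matrix product over a user-supplied semiring.

    A and B are dicts mapping row index -> {column index: value}; only the
    stored (non-zero) entries participate, which is what makes the product
    sparse. `add` and `mul` stand in for the overloaded scalar operations.
    """
    C = {}
    for i, row in A.items():
        acc = {}
        for k, a_ik in row.items():
            for j, b_kj in B.get(k, {}).items():
                v = mul(a_ik, b_kj)
                acc[j] = add(acc[j], v) if j in acc else v
        if acc:
            C[i] = acc
    return C

# Standard (+, *) semiring
A = {0: {0: 1.0, 1: 2.0}, 1: {1: 3.0}}
B = {0: {1: 4.0}, 1: {0: 5.0}}
C = spgemm(A, B)

# (min, +) semiring, as used in shortest-path style graph algorithms
C_minplus = spgemm(A, B, add=min, mul=lambda x, y: x + y)
```

The same code covers both numerical linear algebra and graph algorithms purely by swapping the semiring, which is the unifying point of the survey.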